On Abstract Finite-State Morphology

Authors

  • Ajit Narayanan
  • Lama Hashem
Abstract

Aspects of abstract finite-state morphology are introduced and demonstrated. The use of two-way finite automata for Arabic noun stem and verb root inflection leads to abstractions based on finite-state transition network topology as well as the form and content of network arcs. Nonconcatenative morphology is distinguished from concatenative morphology by its use of movement on the output tape rather than the input tape. The idea of specific automata for classes of inflection inheriting some or all of the nodes, arc form and arc content of the abstract automaton is also introduced. This can lead to novel linguistic generalities and applications, as well as advantages in terms of procedural efficiency and representation.

1 Introduction

Finite-state approaches to morphology analyze surface forms by appealing to the notion of a finite-state transducer, which in turn mimics an ordered set of rewrite rules. Instead of introducing intermediate forms, as happens when rewrite rules are used (e.g. [Narayanan and Mehdi, 1991] for Arabic morphology), the finite-state transducer works on two tapes (one representing lexical structure, the other surface structure) and switches states if the symbols currently being scanned on the two tapes match the conditions of the state transition. Following the distinction expressed by Kay [1987], two-level morphology is a specialization of finite-state morphology in that intermediate forms are not required even in the grammatical formalism (e.g. [Koskenniemi, 1983; Koskenniemi, 1984]). The only representations required are those for the lexical and surface forms, together with ways of mapping directly between the two. Surface forms express the result of any spelling-change interactions between dictionary/lexicon primitives.
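The two-tape behaviour described above can be illustrated with a minimal sketch. It is not the automaton used in the paper: the states, the alphabet and the toy spelling rule (a lexical 'N' surfaces as 'm' before 'p' and as 'n' elsewhere) are assumptions chosen only to show how a transition is conditioned on a pair of symbols, one from each tape, with no intermediate form.

```python
# Toy two-tape (two-level) acceptor. Everything here is hypothetical:
# the rule "lexical N surfaces as m before p, as n elsewhere" is an
# illustration, not a rule taken from the paper.

# Transitions map (state, (lexical_symbol, surface_symbol)) -> next state.
TRANSITIONS = {
    # State 0: no pending constraint.
    (0, ("a", "a")): 0,
    (0, ("t", "t")): 0,
    (0, ("p", "p")): 0,
    (0, ("N", "m")): 1,   # N:m was chosen, so a 'p' must follow
    (0, ("N", "n")): 2,   # N:n was chosen, so a 'p' must not follow
    # State 1: the next pair must be p:p.
    (1, ("p", "p")): 0,
    # State 2: anything except p:p may follow.
    (2, ("a", "a")): 0,
    (2, ("t", "t")): 0,
    (2, ("N", "m")): 1,
    (2, ("N", "n")): 2,
}
START = 0
FINALS = {0, 2}  # state 1 is non-final: N:m may not end the word


def accepts(lexical: str, surface: str) -> bool:
    """Return True if the surface string is a valid realization of the
    lexical string under the toy rule. Both tapes are scanned in step,
    one symbol pair at a time."""
    if len(lexical) != len(surface):
        return False
    state = START
    for pair in zip(lexical, surface):
        state = TRANSITIONS.get((state, pair))
        if state is None:  # no transition licenses this symbol pair
            return False
    return state in FINALS


if __name__ == "__main__":
    print(accepts("aNpa", "ampa"))
    print(accepts("aNpa", "anpa"))
```

The lexical and surface strings are checked against each other directly, which is the sense in which a two-level description needs only those two representations.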
A typical architecture of a two-level morphological system [Karttunen, 1983; Kataja and Koskenniemi, 1988] consists of a dictionary/lexicon component containing roots, stems, affixes and their co-occurrence restrictions, and an automaton component which codes for the mappings between dictionary/lexicon forms and surface realizations. One of the problems faced by two-level approaches was their handling of nonconcatenative morphology. The main difference between Semitic and non-Semitic languages is that, in Semitic languages, inflectional patterns are not straightforwardly concatenative (where morphemes are simply concatenated with roots, stems and each other) but 'interdigitate' or 'intercalate', i.e. the affix pattern is distributed among the constituents of the root morpheme. For example, the Arabic root 'd_r_s' ('study') intercalates with the inflectional pattern '_u_i_' (perfect passive) to form the stem 'duris' ('was studied'), which in turn can be inflected to signify number and gender.¹ This nonconcatenative aspect of Arabic can be problematic for a traditional two-level approach which bypasses intermediate forms. The problem concerns the way roots, stems (roots for Arabic verbs, stems for Arabic nouns) and inflection patterns are represented and stored. It is obviously not practical to store all the possible inflected forms

¹ Modern written Arabic rarely marks the vowels (short vowels are marked by diacritics), in this case the 'u' and 'i' in 'duris', except in beginners' books on Arabic. The (text) realization has the form 'drs'.
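The intercalation of 'd_r_s' with '_u_i_' can be sketched in a few lines. This is a plain string-filling illustration, not the two-way automaton the paper proposes; the function names `intercalate` and `unvocalized`, the use of '_' as the slot marker and the vowel set passed to `unvocalized` are all assumptions made for the example.

```python
# Hypothetical helper names; '_' marks an open slot in both root and pattern.

def intercalate(root: str, pattern: str, slot: str = "_") -> str:
    """Distribute the root consonants over the open slots of the pattern.

    The root 'd_r_s' contributes the consonants d, r, s; the pattern
    '_u_i_' contributes the vowels u, i and three slots for consonants.
    """
    consonants = [ch for ch in root if ch != slot]
    if pattern.count(slot) != len(consonants):
        raise ValueError("pattern slots and root consonants do not match")
    remaining = iter(consonants)
    # Each slot consumes the next root consonant; other symbols are copied.
    return "".join(next(remaining) if ch == slot else ch for ch in pattern)


def unvocalized(stem: str, short_vowels: str = "aiu") -> str:
    """Drop short vowels, approximating text that leaves them unmarked."""
    return "".join(ch for ch in stem if ch not in short_vowels)


if __name__ == "__main__":
    stem = intercalate("d_r_s", "_u_i_")
    print(stem)               # duris
    print(unvocalized(stem))  # drs
```

Storing the root and the pattern separately and combining them on demand is one way to avoid listing every inflected form, which is the storage problem raised above.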


Similar resources

On Finite Queue with Two Types of Failures and Preemptive Priority (RESEARCH NOTE)

We study the single-server queueing system with two types of failure to service channels, including preemptive priority for the repair of major failures. The units arrive at the system in a Poisson fashion and are served exponentially. The steady-state probabilities of the various states have been obtained by using generating functions.


Using Mazurkiewicz Trace Languages for Partition-Based Morphology

Partition-based morphology is an approach of finite-state morphology where a grammar describes a special kind of regular relations, which split all the strings of a given tuple into the same number of substrings. They are compiled in finite-state machines. In this paper, we address the question of merging grammars using different partitionings into a single finite-state machine. A morphological...


Optimal Morphology

Optimal morphology (OM) is a finite-state formalism that unifies concepts from Optimality Theory (OT, Prince & Smolensky, 1993) and Declarative Phonology (DP, Scobbie, Coleman & Bird, 1996) to describe morphophonological alternations in inflectional morphology. Candidate sets are formalized by inviolable lexical constraints which map abstract morpheme signatures to allomorphs. Phonology is impl...


PREDICTION OF STATIC SOFTENING OF MICROALLOYED STEEL BY THE INTEGRATION OF FINITE ELEMENT MODEL WITH PHYSICALLY BASED STATE VARIABLE MODEL

Recovery and recrystallization phenomena, and the effects of microalloying elements on these phenomena, are of great importance in designing thermomechanical processes for microalloyed steels. Thus, understanding and modeling microstructure evolution during hot deformation helps to optimize the processing conditions and to improve the product properties. In this study, finite element...


Optimizing the finite-state description of Estonian morphology

Research on modeling Estonian morphology with finite-state devices has been influenced mostly by (Koskenniemi, 1983), (Lauri Karttunen and Zaenen, 1992) and (Beesley and Karttunen, 2000). We have used a lexical transducer combined with two-level rules as a general model for describing Estonian morphology. As a novel approach we can emphasize the application of the rules to both sides of ...




Publication date: 1993